Abstract. We examine the equations to obtain atomic pair distribution functions
(PDFs) from x-ray, neutron and electron powder diffraction data with a view to
obtaining reliable and accurate PDFs from the raw data using a largely ad hoc
correction process. We find that this should be possible under certain circumstances
that hold, to a reasonably good approximation, in many modern experiments. We
describe a variational approach that could be applied to find data correction parameters
that is highly automatable and should require little in the way of user inputs yet results
in quantitatively reliable PDFs, modulo unknown scale factors that are often not of
scientific interest when profile fitting models are applied to the data with scale-factor
as a parameter. We have worked on a particular implementation of these ideas and
demonstrate that it yields PDFs that are of comparable quality to those obtained with
the existing x-ray data reduction program PDFgetX2. This opens the door to rapid and
highly automated processing of raw data to obtain PDFs.

1. Introduction

Total scattering analysis and the closely related atomic pair distribution function (PDF)
method are growing in popularity in the area of nanostructure determination [1, 2, 3],
where the PDF is the Fourier transform of the total scattering structure function.
Total scattering data and the PDF can be obtained from x-ray, neutron [4, 5, 6] and
electron [7, 8] data from isotropically scattering samples such as crystalline powders,
nanoparticles, amorphous materials and liquids. The PDF contains information about
structure at the nanoscale since it utilizes both Bragg and diffuse scattering intensities,
which contribute information about the average and local structures, respectively. With
the advent of high power x-ray and neutron sources with optimized PDF instruments [9, 10],
the emerging realization that for nanoparticles quantitatively reliable PDFs can be
obtained from electron diffraction data [8], and the maturing of sophisticated computer
based modeling programs [11, 12, 13, 14, 15, 16], the use of the PDF is expected to have
a significant impact in the area of nanostructure characterization in the coming years.

In order to obtain the total scattering structure function and the PDF from the data,
significant corrections have to be made to the raw data, as well as proper normalization,
before Fourier transforming to obtain the PDF, $G(r)$, as described in detail in Chapter
5 of [5, 6]. Computer programs exist for doing this [17, 18, 19, 20], but it remains
a tedious and problematic process and a barrier to broader adoption of the method.
Here we investigate whether quantitatively reliable PDFs can be obtained from powder
diffraction data using purely ad hoc corrections. "Quick and dirty" PDFs have been
obtained in this way for some time [5]. We explore this in more detail and show that, if
certain experimental conditions hold, not only "quick and dirty" but accurate PDFs may
be obtained using completely ad hoc correction methods. We propose a variational
approach that should allow quantitatively correct reduced structure functions and PDFs
to be obtained this way, modulo a global scale factor on the intensities and the atomic
displacement parameters. We have implemented this approach in a program, the details
of which will be described elsewhere. However, we reproduce a figure here that directly
demonstrates the promise of this approach by yielding x-ray PDFs of comparable quality
to those obtained from the widely used PDFgetX2 program [17]. This has the potential
to greatly simplify total scattering and PDF studies, for example, facilitating real-time
data processing during data collection.

2. Data reduction

We will first discuss how the total scattering structure function is typically obtained in
x-ray and neutron diffraction. This process has an established theoretical foundation [5].

2.1. X-ray and neutron diffraction data reduction

The reduced total scattering structure function, $F(Q)$, is defined in terms of the total
scattering structure function, $S(Q)$, as

$$F(Q) = Q\,[S(Q) - 1]. \tag{1}$$

The structure function contains the discrete coherent singly scattered information
available in the raw diffraction intensity data. It is defined according to [4]

$$S(Q) = \frac{I_c(Q) - \langle f(Q)^2 \rangle + \langle f(Q) \rangle^2}{\langle f(Q) \rangle^2}, \tag{2}$$

which gives [21]

$$S(Q) - 1 = \frac{I_c(Q) - \langle f(Q)^2 \rangle}{\langle f(Q) \rangle^2} = \frac{I_d(Q)}{N \langle f(Q) \rangle^2}, \tag{3}$$

where $f$ is the $Q$-dependent x-ray or electron scattering factor or $Q$-independent neutron
scattering length, as appropriate, and $\langle \ldots \rangle$ represents an average over all atoms in the
sample. In this equation, $I_c(Q)$ is the coherent single-scattered intensity per atom and
$I_d(Q)$ is the discrete coherent scattering intensity, which excludes the self-scattering,
$N\langle f^2 \rangle$ [21]. The coherent scattering intensity is obtained from the measured intensity by
removing parasitic scattering (e.g., from sample environments), incoherent and multiple
scattering contributions, and correcting for experimental effects such as absorption,
detector efficiencies, detector dead-time and so on [5]. The resulting corrected measured
intensity is normalized by the incident flux to obtain $I_c(Q)$. The self-scattering, $N\langle f^2 \rangle$,
and normalization, $N\langle f \rangle^2$, terms are calculated from the known composition of the
sample using tabulated values of $f$.
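
The conversion from $I_c(Q)$ to $S(Q)$ and $F(Q)$ can be sketched numerically as follows. This is a minimal illustration only: the function name, the two-component composition and the scattering lengths are hypothetical, and $Q$-independent (neutron-like) scattering lengths are assumed so that the averages are plain numbers.

```python
import numpy as np

def reduced_structure_function(q, i_c, f_sq_avg, f_avg_sq):
    """Return S(Q) and F(Q) from the per-atom coherent intensity I_c(Q).

    f_sq_avg is <f^2> (self-scattering per atom) and f_avg_sq is <f>^2
    (normalization per atom); both may be scalars or arrays over Q.
    """
    s = (i_c - f_sq_avg + f_avg_sq) / f_avg_sq   # structure function S(Q)
    f = q * (s - 1.0)                            # F(Q) = Q [S(Q) - 1]
    return s, f

# Hypothetical sample: two atom types in equal proportion with made-up,
# Q-independent scattering lengths.
conc = np.array([0.5, 0.5])
b = np.array([1.0, 3.0])
f_avg_sq = np.sum(conc * b) ** 2    # <f>^2
f_sq_avg = np.sum(conc * b ** 2)    # <f^2>

q = np.linspace(0.5, 25.0, 50)
# Synthetic I_c: self-scattering plus a single pair term at r = 2.5.
i_c = f_sq_avg + f_avg_sq * np.sin(2.5 * q) / (2.5 * q)
s_q, f_q = reduced_structure_function(q, i_c, f_sq_avg, f_avg_sq)
```

For this synthetic input the self-scattering cancels exactly and `f_q` reduces to $\sin(2.5\,Q)/2.5$, which oscillates around zero as described in the text.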

As evident in Eq. (3), to obtain $S(Q) - 1$ from $I_c(Q)$ we subtract the self-scattering,
$N\langle f^2 \rangle$, which has no atom-pair correlation information, and divide by $N\langle f \rangle^2$. As a
result, $S(Q) - 1$ oscillates around zero, and asymptotically approaches it at high $Q$
as the coherence of the scattering is lost. If the experimental effects are removed
correctly, the resulting $F(Q)$ and $G(r)$ are directly related to, and can be calculated
from, structural models [21]. The corrections are well controlled in most cases and
refinements of structural models result in reduced $\chi^2$ values that approach unity in the
best cases. Some uncertainty in the corrections can be tolerated. This is due to a
somewhat fortuitous circumstance that they are mostly long-wavelength in nature, such
as the Compton scattering correction in the case of x-rays, whereas the signal from the
structure is much higher frequency in $Q$. If these long-wavelength contributions are not
correctly removed they result in correspondingly long-wavelength aberrations to $S(Q)$
that appear in $G(r)$ as peaks in the very low-$r$ region below any physically meaningful
PDF peaks [20].

There are various programs for obtaining the PDF from raw data, such as
PDFgetX2 [17], RAD [18] and GudrunX‡ for x-rays and PDFgetN [20] and Gudrun‡ for
time of flight neutrons. These programs provide excellent results but require numerous
data inputs and user interactions and are difficult to learn, with multi-day workshops
sometimes being dedicated to learning their use. If a much simpler data correction
protocol could be found that resulted in PDFs of comparable quality, but which could
be automated, it would potentially greatly expand and assist the PDF community.
Here we explore whether a protocol can be found using completely ad hoc corrections
that can result in quantitatively reliable PDFs. To be explicit, we seek the actual
reduced structure function $F(Q)$ from a sample, given the measured powder diffraction
intensity, $I_m(Q)$. In a conventional data reduction, we begin by finding the coherent
scattered intensity, $I_c(Q)$, from $I_m(Q)$ by making corrections for things such as detector
dead-time, polarization, multiple scattering, backgrounds, and so on. The reduced structure
function is then determined from Eqs. (1), (2) and (3).

Apart from the detector dead-time correction, all the corrections are either simply
additive or multiplicative. If we assume that any detector dead-time is negligible or has
been corrected before getting $I_m$, we can write

$$I_c(Q) = a(Q)\,I_m(Q) + b(Q), \tag{4}$$

where $a$ and $b$ are the generalized (and unknown) $Q$-dependent multiplicative and
additive correction functions, respectively. It is these additive and multiplicative
corrections that are explicitly calculated from theory [5] and applied in the PDF
data reduction programs mentioned above based on detailed user inputs about the
experimental conditions.

Careful inspection of Eqs. (1), (2) and (3) shows that we can also write an expression
for $F(Q)$ itself in the same form,

$$F(Q) = \alpha(Q)\,I_m(Q) + \beta(Q), \tag{5}$$

without loss of generality. Explicitly, substituting Eq. (4) into Eqs. (1)-(3) gives
$\alpha(Q) = Q\,a(Q)/\langle f \rangle^2$ and $\beta(Q) = Q\,[b(Q) - \langle f^2 \rangle]/\langle f \rangle^2$.

Writing the equations this way is of no particular advantage because we don't
know the form, or the $Q$-dependence, of $\alpha$ and $\beta$. However, we do have considerable
information about the nature and asymptotic behavior of $F(Q)$ and we do have some
information about the nature of the physical corrections that combine to make $\alpha$ and
$\beta$. Here we show how, in principle, this can be used to determine the properly corrected
$F(Q)$ with minimal input information.

Careful consideration of the nature of the structural and non-structural components
to the measured signal suggests that there is a good separation between the frequency of
most corrections and the frequency of the structural information in the PDF. The lowest
frequency Fourier component in $F(Q)$ coming from a real structural signal has a period
in $Q$ of $\sim 2\pi/r_{nn}$, where $r_{nn}$ is the shortest inter-atomic bond length. This means that all
additive frequency components in the signal that have lower frequency than this are
certainly coming from non-structural contributions to the signal. On the other hand,
as we discussed above, the additive contributions to the signal coming from extrinsic
sources are predominantly much longer wavelength and more slowly varying than this.

‡ Available from the ISIS disordered materials group website,
http://www.isis.stfc.ac.uk/instruments/sandals/data-analysis/gudrun8864.html

If we assume for the moment that the multiplicative corrections have all been
correctly applied to $I_m$ (i.e., set $\alpha(Q)$ to unity) we could fit a smooth curve that has only
frequency components lower than those of the structural signal (periods in $Q$ longer than
$2\pi/r_{nn}$) through the data and subtract it. This will result in a function that has the
correct asymptotic behavior as $F(Q)$, oscillating around zero, and actually is $mF(Q)$ if
there are no experimental aberrations with frequency components higher than this limit,
where $m$ is an unknown constant that affects the scale of the resulting $F(Q)$ but not its
shape. A similar approach has been used for many years as a post-facto correction to
clean up unwanted oscillations in the low-$r$ region of the PDF. In this approach the
low-$r$ ripples are back-Fourier transformed to $Q$-space and then this signal is subtracted
from the $F(Q)$ before again Fourier transforming the corrected $F(Q)$, resulting in a
cosmetically improved PDF. However, here we argue that a completely ad hoc correction
can give PDFs of comparable quality to those obtained by traditional approaches but
with much less user and computational effort, modulo an unknown scale factor. This last
fact means that other information must be used to obtain the correct absolute scale for
the data. However, in many cases this information is available from other sources, and
structure refinement programs such as PDFgui refine the scale factor as a variable, so it
is not fixed in any case.

This low-frequency requirement holds in practice very well for all the additive 
corrections except for those that actually contain structural information themselves, 
such as scattering from the sample container. However, if sample container scattering 
is significant it can be measured and subtracted fairly straightforwardly. 
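
The additive part of this argument can be illustrated with a small synthetic test. The sketch below is not the algorithm of any existing program: it assumes a made-up two-pair $F(Q)$, an arbitrary smooth background, an arbitrary scale $m$, and a degree-8 polynomial as the "smooth curve", fitted with NumPy's `Polynomial.fit`.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic "true" F(Q) from two hypothetical pair distances (2.5 and 4.3).
q = np.linspace(0.5, 25.0, 2000)
f_true = np.sin(2.5 * q) / 2.5 + 0.5 * np.sin(4.3 * q) / 4.3

# Hypothetical measurement: unknown scale m times F(Q) plus a slowly
# varying additive contribution (both invented for this illustration).
m = 3.0
background = 40.0 * np.exp(-q / 10.0) + 0.3 * q
i_m = m * f_true + background

# Fit a low-order polynomial through the data and subtract it.  The low
# degree keeps the fitted curve from following the structural oscillations.
smooth = Polynomial.fit(q, i_m, deg=8)
f_adhoc = i_m - smooth(q)

# Agreement in shape with m * F(Q); the scale itself is not recovered.
correlation = np.corrcoef(f_adhoc, m * f_true)[0, 1]
```

In this idealized case `f_adhoc` oscillates around zero and follows the shape of $mF(Q)$ closely. A small part of the structural signal is inevitably absorbed by the polynomial, which is why the degree has to be kept low relative to the number of oscillations in the data.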

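We now consider the effect on the PDF of the multiplicative term, $\alpha(Q)$. For
reference, consider the ideal correlation function, $G_{ij}(r)$, from a single atom pair
situated a distance $r_{ij}$ apart. Using Debye's equation [22] for the coherent scattering
amplitude,

$$I_c(Q) = \frac{1}{N} \sum_{i,j} f_i^*(Q)\, f_j(Q)\, \frac{\sin(Q r_{ij})}{Q r_{ij}}, \tag{6}$$

we get $F_{ij}(Q)$ corresponding to the single peak PDF as

$$F_{ij}(Q) \propto \frac{\sin(Q r_{ij})}{r_{ij}}. \tag{7}$$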

This gives for the PDF [21]

$$
\begin{aligned}
G_{ij}(r) &\propto \int_{Q_{min}}^{Q_{max}} \frac{\sin(Q r_{ij})}{r_{ij}} \sin(Qr)\, dQ \\
&\propto \frac{\sin((r - r_{ij})Q_{max})}{r - r_{ij}} - \frac{\sin((r + r_{ij})Q_{max})}{r + r_{ij}}
 - \frac{\sin((r - r_{ij})Q_{min})}{r - r_{ij}} + \frac{\sin((r + r_{ij})Q_{min})}{r + r_{ij}},
\end{aligned} \tag{8}
$$

which is the sum of two signals, one with maximum at $r_{ij}$, and the other at $-r_{ij}$. We
ignore the contribution from $Q_{min}$, which oscillates much slower than the contributions
from $Q_{max}$. In general we only compute the PDF on the positive axis, and the contribution
on the positive axis from the peak centered at $-r_{ij}$ is ignored with little loss in accuracy
as it is small [23]. As we expect, the PDF is a peak at the position $r_{ij}$ but with the
characteristics of a sinc function due to the finite Fourier transform. The central peak
of the sinc function has a FWHM that is inversely proportional to the width of the window
in $Q$-space, with intensity tails that die off as $1/r$ away from the peak on the low- and
high-$r$ sides, modulated by an oscillation with a wavelength of order $1/Q_{max}$ (precisely
$2\pi/Q_{max}$). It is shown in Fig. 1 for various values of $Q_{max}$. For large $Q_{max}$
these signals approach Dirac-delta functions centered at $\pm r_{ij}$. Without taking into
consideration peak broadening due to thermal fluctuations, the finite width of these
peaks is solely due to the finite $Q_{max}$.
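
The sinc-shaped peak and its dependence on $Q_{max}$ can be checked numerically. The sketch below is an illustration under stated assumptions: $Q_{min} = 0$, a hypothetical pair distance $r_{ij} = 2.5$, the normalization $G(r) = (2/\pi)\int F(Q)\sin(Qr)\,dQ$, and helper names (`sine_transform`, `analytic_peak`, `fwhm`) that are invented here.

```python
import numpy as np

def sine_transform(q, f_q, r):
    """G(r) = (2/pi) * integral of F(Q) sin(Q r) dQ, by the trapezoid rule."""
    integrand = f_q[None, :] * np.sin(np.outer(r, q))
    dq = q[1] - q[0]
    total = integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1])
    return (2.0 / np.pi) * dq * total

def analytic_peak(r, r_ij, q_max):
    """Closed form for Q_min = 0, with the prefactor 1 / (pi r_ij)."""
    # sin(x * q_max) / x written via np.sinc so that x = 0 is well defined.
    minus = q_max * np.sinc((r - r_ij) * q_max / np.pi)
    plus = q_max * np.sinc((r + r_ij) * q_max / np.pi)
    return (minus - plus) / (np.pi * r_ij)

def fwhm(r, g):
    """Full width at half maximum of the main peak, read off the r grid."""
    above = r[g >= 0.5 * g.max()]
    return above.max() - above.min()

r_ij = 2.5                              # hypothetical pair distance
r = np.linspace(1.5, 3.5, 1001)
results = {}
for q_max in (15.0, 25.0):
    q = np.linspace(0.0, q_max, int(round(q_max / 0.02)) + 1)
    f_q = np.sin(q * r_ij) / r_ij       # single-pair F(Q)
    g = sine_transform(q, f_q, r)
    err = np.max(np.abs(g - analytic_peak(r, r_ij, q_max)))
    results[q_max] = (r[np.argmax(g)], fwhm(r, g), err)
```

Each entry of `results` holds the peak position, the FWHM and the largest deviation between the numerical transform and the closed form. The peak stays at $r_{ij}$ while its width shrinks in inverse proportion to $Q_{max}$.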

Next we consider the effect of multiplicative distortions $\alpha'(Q)$ on this signal. In
other words, we assume that the multiplicative corrections have not been done correctly
and we Fourier transform

$$F'(Q) = \alpha'(Q)\,F(Q) \tag{9}$$

instead of $F(Q)$ itself.

In the best case, $\alpha'(Q)$ is constant and it scales the peaks of the correlation function
uniformly, which does not distort the structural information. Models fit to data that are
distorted only by a constant scale factor give equivalent structural results provided a
constant scale factor can be refined in the model [20].

Now let us consider the effects on the PDF, $G(r)$, of a $Q$-dependent $\alpha'(Q)$. To do
this we assume that $\alpha'(Q)$ has a convergent Fourier series expansion over the interval
$[0, Q'_{max}]$, and that $Q'_{max} > Q_{max}$. This means that the longest wavelength component
of $\alpha'(Q)$ may be greater than the extent of the measured signal. We express the Fourier
expansion of $\alpha'(Q)$ as

$$
\begin{aligned}
\alpha'(Q) &= \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos\left(\frac{2\pi n}{Q'_{max}} Q\right) + b_n \sin\left(\frac{2\pi n}{Q'_{max}} Q\right) \right] \\
&= \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos(r'_n Q) + b_n \sin(r'_n Q) \right],
\end{aligned} \tag{10}
$$

which serves to define $r'_n$. Long wavelength Fourier components in $\alpha'(Q)$ correspond to
small values of $r'_n$. We only need to consider the cosine components of $\alpha'(Q)$, because
the sine components do not contribute to $G(r)$ due to the sine Fourier transform. Thus,
for a given $n$, we have for our $F'(Q)$ of a single peak PDF

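$$
\begin{aligned}
F'(Q) &\propto a_n \cos(r'_n Q)\, \frac{\sin(Q r_{ij})}{r_{ij}} \\
&= \frac{a_n}{2 r_{ij}} \left[ \sin(Q(r_{ij} + r'_n)) + \sin(Q(r_{ij} - r'_n)) \right].
\end{aligned} \tag{11}
$$

Here, we have used trigonometric identities to go from the first line to the second line.
Putting this into the form of Eq. (7),

$$
F'(Q) \propto \frac{a_n}{2} \left[ \left(1 + \frac{r'_n}{r_{ij}}\right) \frac{\sin(Q(r_{ij} + r'_n))}{r_{ij} + r'_n}
 + \left(1 - \frac{r'_n}{r_{ij}}\right) \frac{\sin(Q(r_{ij} - r'_n))}{r_{ij} - r'_n} \right]. \tag{12}
$$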

Comparing Eq. (12) to Eq. (7) we see that instead of a single peak at $r_{ij}$ in $G(r)$,
Eq. (12) produces two peaks of almost equal intensity, one at $r_{ij} + r'_n$ and one at
$r_{ij} - r'_n$. In actuality, the precise amplitudes of these two peaks are not the same; the
peak at $r_{ij} + r'_n$ is larger than the one at $r_{ij} - r'_n$, and the amplitude difference is
$2r'_n/r_{ij}$ (in units of $a_n/2$).

As we discussed earlier, we expect most aberrations coming from imperfect
multiplicative corrections to be long-wavelength, for example, extinction and absorption
corrections. These only have Fourier components with small $r'_n$, in the limit $r'_n \ll r_{ij}$. In
this case, the two distinct sinc peaks would appear as a single unresolved but broadened
peak close to the position of the undistorted PDF peak at $r_{ij}$. As $r'_n/r_{ij}$ gets larger, the
peak would further broaden, shift slightly and become asymmetric due to the amplitude
difference of the signals. The position of the maximum of an asymmetric peak would
be larger than $r_{ij}$ due to this asymmetry, and the asymmetry would be more pronounced
for peaks at lower $r$. The combined influence from multiple Fourier components would
smear out any peak splitting but accentuate the asymmetry of the peak. Thus, the
effect of all imperfectly corrected multiplicative aberrations to $F(Q)$ is to broaden and
skew peaks in the PDF. This peak broadening is the same effect as that given by the
convolution theorem of Fourier transforms: the peak in $r$-space is broadened by the
convolution of the Fourier transform of $\alpha'(Q)$ with the pristine PDF peak.
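
The peak splitting produced by a single cosine component can be reproduced numerically. The sketch below is illustrative only: the values of $r_{ij}$, $r'_n$, $a_n$ and $Q_{max}$ are invented, $Q_{min} = 0$ is assumed, and the helper names are hypothetical. It first checks that the two-term form with weights $1 \pm r'_n/r_{ij}$ is algebraically the same function as $a_n \cos(r'_n Q)\sin(Q r_{ij})/r_{ij}$, and then locates the two peaks in the sine transform.

```python
import numpy as np

def sine_transform(q, f_q, r):
    """G(r) = (2/pi) * integral of F(Q) sin(Q r) dQ, by the trapezoid rule."""
    integrand = f_q[None, :] * np.sin(np.outer(r, q))
    dq = q[1] - q[0]
    total = integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1])
    return (2.0 / np.pi) * dq * total

# Illustrative values only (not taken from any measurement).
r_ij, r_n, a_n, q_max = 2.5, 0.5, 1.0, 25.0
q = np.linspace(0.0, q_max, 1251)

# One cosine component of the distortion times the single-pair signal.
f_dist = a_n * np.cos(r_n * q) * np.sin(q * r_ij) / r_ij

# The same function written as two single-pair-like terms.
r_plus, r_minus = r_ij + r_n, r_ij - r_n
w_plus, w_minus = 1 + r_n / r_ij, 1 - r_n / r_ij
f_two_terms = 0.5 * a_n * (w_plus * np.sin(q * r_plus) / r_plus
                           + w_minus * np.sin(q * r_minus) / r_minus)
identity_ok = np.allclose(f_dist, f_two_terms)

r = np.linspace(1.0, 4.0, 1501)
g_dist = sine_transform(q, f_dist, r)

def peak_near(r0, half_window=0.2):
    """Position and height of the largest value of g_dist close to r0."""
    idx = np.flatnonzero(np.abs(r - r0) < half_window)
    best = idx[np.argmax(g_dist[idx])]
    return r[best], g_dist[best]

pos_plus, h_plus = peak_near(r_plus)
pos_minus, h_minus = peak_near(r_minus)
height_ratio = h_plus / h_minus
# Multiplying by r undoes the 1/r factor carried by a single-pair signal.
weight_ratio = (pos_plus * h_plus) / (pos_minus * h_minus)
```

With these values the two sinc peaks sit at $r_{ij} \pm r'_n$ with nearly equal heights in $G(r)$, while their $r$-weighted heights stand in the ratio $(1 + r'_n/r_{ij})/(1 - r'_n/r_{ij})$. In other words, the unequal amplitudes are the weights measured relative to a single-pair signal, which itself scales as $1/r$.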

Because all long-wavelength aberrations in the data result in a broadening of the
peaks, this suggests that an ad hoc variational approach could be found to search for
the Fourier coefficients of some unknown $\alpha'(Q)$ where these are adjusted in such a way
as to make the resulting PDF peaks as sharp and as symmetric as they can be. This
could be automated in a regression scheme. A challenge here is that there is also peak
broadening in the data that has real physical significance: the thermal motions and
static disorder. Indeed, this broadening is produced by a low-frequency multiplicative
factor applied to the intensity in $Q$-space, the Debye-Waller factor [1]. Thus, unlike
the case of the additive corrections, there is not a clean separation in frequency of the
physical signal and the experimental aberrations that we can exploit here. It is possible
that a scheme could be found to separate the contributions by applying some additional
knowledge about the behavior of the different functions. For example, the Debye-Waller
factor affects different PDF peaks differently depending on the atoms contributing to
the peak whereas the data corrections do not have this chemical specificity. Thus, the
relative atomic displacement factors could be recovered modulo an uncertain overall
scale that may be obtained from other measurements.

To test out these ideas we have created an implementation of the procedure. The
program is described in detail elsewhere [24]. It models $\beta'(Q)$ as a polynomial of
order no higher than 8 or 9, where this number is chosen to limit the highest
frequency possible in $\beta'(Q)$ [21]. At this time $\alpha'(Q)$ is simply set to unity. This
implementation has been tested on real data obtained in high energy rapid acquisition
PDF mode on fine powders and nanomaterials. Under these conditions the absorption
and extinction effects are expected to be small. The data were reduced from raw 1D
intensities to PDFs using the new procedure, and these PDFs were compared to the
PDFs obtained from the same data using the PDFgetX2 program [17]. The agreement
is very good. The result for a representative sample of Ni is shown in Figure 2.
For more details and more comparisons we point the reader to the publication on the
program, PDFgetX3 [21]. Nonetheless, this comparison shows that this ad hoc approach
to data corrections works rather well in practice and may be used to obtain quantitatively
reliable PDFs, at least under favorable experimental conditions.

Figure 2. Comparison of PDFs obtained using the new procedure and using
PDFgetX2, which implements all the corrections explicitly. The data are from a
high energy x-ray measurement of nickel powder. The PDF obtained with the ad
hoc procedure is plotted in green and the one obtained using PDFgetX2 is shown in
blue. The difference curve is offset below. The horizontal dashed lines are guides to
the eye. See [24] for more details.